Sounds of Science is a monthly podcast about beginnings: how a molecule becomes a drug, how a rodent elucidates a disease pathway, how a horseshoe crab morphs into an infection fighter. The podcast is produced by Eureka, the scientific blog of Charles River, a contract research organization for drug discovery and development. Tune in and begin the journey.
Mary Parker:
My name is Mary Parker. Welcome to this episode of Eureka's Sounds of Science.
We are all familiar with the concept of a control group in scientific research, a group that is used as a comparison for the group receiving the drug under investigation. For animal models, this means a group of mice or rats who are kept in the same conditions as the group being tested as much as possible. Same food, same water, same housing, et cetera, but who are not given the drug in question. However, after decades of research using control groups and with the invention of more sophisticated data analysis algorithms, it's time to consider the virtual control group.
Joining me to discuss this topic are Laura Lotfi, associate director of product management for Charles River, and Guillemette Duchateau-Nguyen, senior leader in predictive modeling and data and analytics for Roche.
Welcome, Laura and Guillemette.
Laura Lotfi:
Thank you, Mary.
Guillemette Duchateau-Nguyen:
Thanks, Mary.
Mary Parker:
Thank you both for joining me. Can we start with introductions? Guillemette, can you tell me about your scientific background and your interest in the virtual control groups project?
Guillemette Duchateau-Nguyen:
I am a biostatistician at Roche, in research and early development, supporting drug development with data analytics. For many years, I've been involved in biomarker data analysis in both clinical and pre-clinical settings. I have been involved in an IMI consortium, which is a [inaudible 00:01:38] Innovative Medicines Initiative, called eTRANSAFE, which prototyped, basically, an application involving the reuse of data from control groups. So, we started to look into this data in this consortium, and a virtual control group expert team was founded. The initial repository was created, containing data from rodents and non-rodents included in animal tox studies, coming from five different pharma companies.
So, all these companies started to prepare the data, curate, anonymize the data and share the data, to build an initial repository that could be used to build virtual control groups. We started basically a few years ago on this topic.
Mary Parker:
Okay. Wow, that's perfect. Laura, over to you. So, what is your background, and what exactly is a virtual control group?
Laura Lotfi:
Mary, I started as a toxicologist with specialized experience in safety assessment studies. Most of my research and work was focused on making sure that the drug development pathway is safe and secure for patients and for testing. So, I started as a study director at Charles River six years ago, and I was always interested in the meeting between technology and science. During my experience, I was exposed to multiple projects to enable some of those. That led me to the virtual control groups, and... What is it? It's actually curated, matched control data collected from control animals on previous studies. It was collected during a specific time frame, and it includes their time points and endpoints on a specific study design. So, the virtual control by itself is a collection of that, a repository database that is available to be tapped into and used.
Mary Parker:
For the virtual control group that we developed, is it all historical data, or did they have any new observations that they input to add more data to the system? Did they create any control groups just to populate data for training the algorithm?
Laura Lotfi:
No, they're all historical control data. They are coming from previous studies' control data sets.
Mary Parker:
That's great. I found the paper that you're both listed on, and it mentioned using virtual control groups for toxicity studies. Why were these studies good candidates for training the virtual control groups?
Guillemette Duchateau-Nguyen:
Now, historical data from control groups in animal tox studies were used already for comparative purposes, to assess the validity and robustness of the study results. So, it's not completely new to use historical data, but now, we would go beyond that and use the historical controls to build the virtual control group, to replace either totally or partially the concurrent control. The animal tox studies that we want to use here are very standardized studies. The same set of endpoints is measured in the blood, in a quite standard setting: the duration, the species, the strain used, all of these are quite standardized. So, that's basically the perfect candidate to use this virtual control group, I would say.
Mary Parker:
That makes perfect sense. Laura, anything to add?
Laura Lotfi:
Guillemette framed it very well. I think the only addition I would highlight here is that, also, toxicity studies recently have been aligned in a way that the data can be consumed and prepared for such a repository. So, that also was a great win of data alignment that could lead to some of those products.
Mary Parker:
Yeah. That's, I think, an often overlooked aspect of artificial intelligence: the data that you feed it has to be perfect or else it's not going to learn what you want it to learn. So, that's a good point. Guillemette, how would a virtual control group work for you and your research at Roche, and what are you trying to achieve from a statistical standpoint?
Guillemette Duchateau-Nguyen:
I guess what I would like to emphasize is that this virtual control group is not a new concept. It has been used already in clinical studies where external controls are used. So, we can really learn from this experience, even with animals that would be quite different from humans, but there are [inaudible 00:06:16]. So, I guess when we have the algorithm ready to create the virtual control group and the qualification procedure is established, we could really think about using this virtual control group to replace the concurrent control group. The first step maybe is to do that first in studies where we don't have a concurrent control, so that would just enhance interpretation of results by adding the virtual control group. So, that could be the first step that we could consider at Roche or any other companies or institutions doing this [inaudible 00:06:49] studies.
From the statistical point of view, I would say the very key step is really to understand the various sources of variability in the studies, using exploratory data analysis methods, because there are very different factors which could influence the endpoints we measure in animals. They could be at the study level, such as the facility where the study takes place, the type of housing, the diet. Many, many factors could influence the result, what you measure, basically, in animals.
There are other factors, such as the instruments and the assays which are used, that could also have an influence. We know that because we have run some tests and done some explorations, so we have seen that certain assays have evolved over time. So, for that, we really need a large database to perform this kind of exploration. So, it's what we started to do with this eTRANSAFE consortium, and we want to pursue this effort, basically. Once this step is really performed and we have a better understanding of the sources of variability, then we can start really looking at and refining an algorithm to create the virtual control group. So, that's about that, basically, in two steps, I would say.
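As an illustration of the first step described above, here is a minimal sketch of an exploratory check on how much of an endpoint's spread is associated with a study-level factor. It is not the consortium's method. The record fields (`facility`, `assay`, `alt`), the values, and the function name `between_group_share` are all hypothetical, and the statistic used is a plain between-group share of variance (eta squared).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical historical control records; every value is made up for illustration.
records = [
    {"facility": "A", "assay": "v1", "alt": 30.0},
    {"facility": "A", "assay": "v2", "alt": 32.0},
    {"facility": "B", "assay": "v1", "alt": 40.0},
    {"facility": "B", "assay": "v2", "alt": 42.0},
]


def between_group_share(records, factor, endpoint):
    """Share of the total sum of squares of `endpoint` that lies between
    the groups defined by `factor` (eta squared, between 0 and 1)."""
    values = [r[endpoint] for r in records]
    grand_mean = mean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0

    # Collect endpoint values per level of the factor.
    groups = defaultdict(list)
    for r in records:
        groups[r[factor]].append(r[endpoint])

    between_ss = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    return between_ss / total_ss


for factor in ("facility", "assay"):
    share = between_group_share(records, factor, "alt")
    print(f"{factor}: {share:.3f}")
```

In this toy data set the facility accounts for most of the spread and the assay version for very little, which is the kind of signal that would suggest matching virtual controls on facility before anything else.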
Mary Parker:
Laura, how did this idea for VCGs come about, and what kind of work did it take to set it up?
Laura Lotfi:
If you look at it, Mary, from multiple angles, where we are today in the technology and the regulatory shift in the landscape, there is always a need to innovate and push boundaries further. All the work mentioned that has been done with the consortium previously has led to this hypothesis to be tested, and obviously, being a CRO, the question poses itself of what can we do with all this control data, what are we not asking that control data to help us with further, especially from advanced analytics as well as animal welfare. So, the idea came about from, really, using the data that we have in place to answer some of these scientific questions.
A lot of work is behind this new concept. First of all, we have a responsibility to make sure the scientific rigor is upheld to standard, as we don't want to impact that in any way on the patient and the drug development pathways. But the data set in itself, understanding the criticality of certain selection criteria, variability in the data set, keeping the natural variation of a database like this is very important to reflect the reality of what happens in those types of studies, making sure that we don't over-customize something so as not to lead to false positives or false negatives. So, the methodology behind it requires a lot of advanced analytics algorithms and statistical approaches as well as biological qualification.
It's key to note here that the methodology is multidisciplinary in a topic like this. In a subject like this, you need not only the statisticians and the data scientists, but also SMEs or experts in the field, like pathologists or toxicologists or any other scientists, to weigh in when it comes to data analysis and data use cases. The work behind it has extensive steps and requires evaluation across many different knowledge domains.
Mary Parker:
Is it fair to say that the virtual control group doesn't necessarily save human labor, but it does obviously save mouse models and feeds into the three R's of reduction, for sure... But the monitoring and everything is still going to be up to the same standards as a traditional scientific test.
Laura Lotfi:
Well, there is obviously an ethical impact and a potential reduction of animals used in those control groups. The scientific needs or workload will be as extensive or even more extensive when it comes to larger data sets. Now, that doesn't necessarily always translate into longer timelines. It just translates more into higher quality in some contexts, or just, when you have access to a larger data set like this, you are able to be more confident in some of those discussions. Having access to larger data allows you to see some of the background findings or new information.
Mary Parker:
Yeah. Back to Guillemette. What do you think are the most obvious benefits and drawbacks of using a virtual control group, especially in your work?
Guillemette Duchateau-Nguyen:
As Laura mentioned, the ethical aspect with the three R's is really a key benefit, considering that you remove partially or totally your concurrent control group. In terms of data quality, as well, we are going to improve and try not to sway the data and the quality of the data that we have in our repository. By going back into old studies, we could further curate and harmonize the data we have there, and provide ways of using this data, not maybe only for virtual control groups, but for other purposes.
Finally, the cost could also be impacted, maybe decreased. That's an option as well. In terms of drawbacks, I guess we will need to address our way of working because it will be a big change for many of us in terms of how you look at the results, how you interpret those results, taking into account the external control. Even if they are a match, I mean... There would be new ways of looking at the data. We need to learn about that.
And I can see that in certain instances where experimental conditions are very different, very unusual... Let's say an experiment with a vehicle that was never used before, where you could expect an impact on the measurement; then potentially not having a virtual control group... external controls using this vehicle could be a problem. So, maybe you would still need to run this concurrent control or maybe [inaudible 00:13:23] could be another way to tackle this challenge. So, that would be, in a nutshell, I would say, the possible benefits and drawbacks that we could see, and we learn by doing, basically.
Mary Parker:
Yeah.
Laura Lotfi:
Yeah. Mary, if I'm allowed to add a little bit of flavor on some of those. I totally agree with all the benefits that Guillemette just mentioned. Possibly, with this new journey, I think exploring some of the regulatory aspects might become a... It will be, I would say, a challenge that we have to overcome, ensuring that it is aligned with some of those guidelines and regulatory expectations. Definitely, there is a space for some of this work in a non-regulatory space that could be an easy win, but definitely it's something that we need to keep in mind when it comes to those types of studies.
Mary Parker:
I was just going to get into the regulatory aspect next, so good timing. Guillemette, do you have anything to add to the regulatory angle?
Guillemette Duchateau-Nguyen:
I guess what I would like to say is that from the beginning of this journey, when we started in eTRANSAFE, [inaudible 00:14:37] description of a virtual control group. We interacted with them. We had a workshop with [inaudible 00:14:43] one year ago where the FDA was present, where we could really discuss all these aspects, all the challenges, and there is a report published also in the [inaudible 00:14:54] that you mentioned really briefly, Mary. We also have discussions with EMA, so the European agency as well, on their approach. I can say that both FDA and EMA are very open to this initiative. We could also further mention that it is aligned with the FDA Modernization Act 2.0, which really allows for alternatives to animal testing. So, I think it's really aligned with that.
VCGs are going in the right direction. With this new proposed consortium that we might have, FDA will also be present there, potentially. I agree with Laura. It will be challenging. We will need to have some work done around these different aspects, different challenges, but I think it's already going in the right direction. That's it.
Mary Parker:
Okay. Just as my last question, can either of you see this project expanding to other kinds of studies, besides toxicity studies?
Guillemette Duchateau-Nguyen:
I guess the problem is that as soon as it's not standardized, the endpoints are not standardized, then it could be an issue. It relies on that, so I don't know. Maybe, Laura, you have better insight around the type of studies that could benefit from that.
Laura Lotfi:
The way that I see it, the project might be expanding in areas of data sets rather than whole-study implementation. Things like, we can tap into vehicle libraries or microscopic finding database libraries, background findings, enabling for example digital pathology whole-slide images. Out of this project, the expansion is probably going to look different and be of a different flavor. From a study type perspective, it will require some data curation and continuous data alignment. There are, at the same time, other efforts to bring some of those virtual control group needs to groups like CDISC, or to SEND data sets, to drive forward some alignment in areas where that is not yet the case. So, this is where I think some of those discussions can help drive further discussion on alignment when it's not yet there.
Mary Parker:
Yeah. I feel like it would be valuable to have data standardization, even if it doesn't result in another type of virtual control group. I mean, that just seems like a good way to share data, scientifically, if we're all doing it in the same way.
Guillemette Duchateau-Nguyen:
Yeah. I think it will be one of the benefits of the virtual control group that we will improve the way we capture the data. Sometimes, we are missing some study metadata. It's not that they are missing, but they are not captured in the current database that we have. So, improving the format, the standard around this could be really valuable, and everybody to align on... Yeah. Everybody would deliver the same kind of data content.
Mary Parker:
What will implementation of the virtual control groups look like? How do we get from theory into practice?
Laura Lotfi:
Yeah. Great question, Mary, because there has been a lot of work behind testing a lot of the data analysis, the selection criteria, etc. But how do we see the VCG on a study? Some efforts that we're doing, along with Guillemette as well and other teams, are to do some retrospective exercises, so seeing some of those virtual control group data sets along with the actual control on a study, comparing the behavior of those groups, collecting concordances and seeing whether the safety assessment outcome might need any refinement or change with having the VCG there. This will give us enough confidence and some refinement that would probably be needed for the database and selection of those VCGs. Then we can take it to the next level of either implementing it next to the concurrent control group for a while or partially replacing some of those concurrent control groups with the virtual control groups in the non-regulated areas, and then we will take it to the next step of regulatory, when ready.
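The retrospective exercise described above can be pictured with a minimal sketch: take a finished study, evaluate the treated group once against its concurrent control and once against a virtual control, and record whether the two comparisons agree. Everything here is an assumption for illustration, including the function names, the made-up values, and the deliberately crude decision rule (a mean shift of more than two control standard deviations); a real qualification would rely on the statistical and biological review the speakers describe.

```python
from statistics import mean, stdev


def flags_effect(treated, control, threshold=2.0):
    """Crude, hypothetical rule: flag an effect when the treated mean differs
    from the control mean by more than `threshold` control standard deviations."""
    return abs(mean(treated) - mean(control)) > threshold * stdev(control)


def concordant(treated, concurrent, virtual, threshold=2.0):
    """True when the concurrent control and the virtual control lead to the
    same flag (both 'effect' or both 'no effect') for the treated group."""
    return flags_effect(treated, concurrent, threshold) == flags_effect(
        treated, virtual, threshold
    )


# Made-up endpoint values for one retrospective study.
concurrent_control = [30.0, 32.0, 34.0]
virtual_control = [29.0, 31.0, 33.0, 35.0, 32.0]

clear_effect = [50.0, 52.0, 54.0]
borderline = [36.0, 36.2, 36.4]

print(concordant(clear_effect, concurrent_control, virtual_control))
print(concordant(borderline, concurrent_control, virtual_control))
```

With these numbers the clear effect is flagged against both controls, while the borderline group is flagged only against the concurrent control, because the virtual control has a slightly wider spread. Cases like the second one are where a retrospective exercise would prompt a closer look at how the virtual controls were selected.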
Mary Parker:
All right. Well, thank you so much, both of you, for sharing your expertise in this area. It's a fascinating area and I can't wait to see it expand.
Laura Lotfi:
Thank you for having us, Mary. It's really interesting to share with everybody.
Guillemette Duchateau-Nguyen:
Thank you to both of you.